The Problem of Missing Values in Decision Tree Grafting
Author
Abstract
Decision tree grafting adds nodes to inferred decision trees. Previous research has demonstrated that appropriate grafting techniques can improve predictive accuracy across a wide cross-section of domains. However, previous decision tree grafting systems are shown to have a serious deficiency on some data sets containing missing values. The problem arises from the method for handling missing values employed by C4.5, in which the grafting systems have been embedded. This paper provides an explanation of and a solution to the problem. Experimental evidence of the efficacy of this solution is presented.
Similar resources
Identification of the most important factors of ethnic differences in anthropometric dimensions of Iranian workers using the decision tree
Background and aims: Anthropometry is the branch of human science concerned with the physical measurement of the human body, especially its size and shape. One application of anthropometric data in ergonomics is the design of working spaces and the development of industrial products, so that the tools, equipment and workstations, which are designed based on the physical dimensions of the workers, ...
Performance evaluation of different estimation methods for missing rainfall data
There are numerous methods for estimating missing values, and the choice among them depends on the data type and regional climatic characteristics. In this research, part of the monthly precipitation data from the Sarab synoptic station, East Azerbaijan province, Iran, was randomly selected and treated as missing. In order to study the effectiveness of various methods to estimate missing data, by seven classic s...
A New Algorithm for Optimization of Fuzzy Decision Tree in Data Mining
Decision-tree algorithms provide one of the most popular methodologies for symbolic knowledge acquisition. The resulting knowledge, a symbolic decision tree along with a simple inference mechanism, has been praised for comprehensibility. The most comprehensible decision trees have been designed for perfect symbolic data. Classical crisp decision trees (DT) are widely applied to classification t...
Learning Decision Tree Classifiers from Attribute Value Taxonomies and Partially Specified Data
We consider the problem of learning to classify partially specified instances, i.e., instances that are described in terms of attribute values at different levels of precision, using user-supplied attribute value taxonomies (AVT). We formalize the problem of learning from AVT and data and present an AVT-guided decision tree learning algorithm (AVT-DTL) to learn classification rules at multiple l...
Privacy-Preserving Imputation of Missing
Handling missing data is a critical step to ensuring good results in data mining. Like most data mining algorithms, existing privacy-preserving data mining algorithms assume data is complete. In order to maintain privacy in the data mining process while cleaning data, privacy-preserving methods of data cleaning will be required. In this paper, we address the problem of privacy-preserving data i...